An Improved Extraction Pattern Representation Model for Automatic IE Pattern Acquisition

Authors

  • Kiyoshi Sudo
  • Satoshi Sekine
  • Ralph Grishman
Abstract

Several approaches have been described for the automatic unsupervised acquisition of patterns for information extraction. Each approach is based on a particular model for the patterns to be acquired, such as a predicate-argument structure or a dependency chain. The effect of these alternative models has not been previously studied. In this paper, we compare the prior models and introduce a new model, the Subtree model, based on arbitrary subtrees of dependency trees. We describe a discovery procedure for this model and demonstrate experimentally an improvement in recall using Subtree patterns.
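To make the contrast between pattern models concrete, the following is a minimal illustrative sketch, not the discovery procedure from the paper (which the abstract does not detail). It assumes a toy dependency tree for a hypothetical sentence, treats a "chain" as a downward head-to-dependent path, and treats a "subtree" as any connected set of nodes with a single topmost node. The names `TREE`, `chains_from`, and `subtrees_from` are invented for this example.

```python
from itertools import product

# Toy dependency tree (hypothetical example): head -> list of dependents.
TREE = {
    "bombed": ["terrorists", "embassy"],
    "terrorists": [],
    "embassy": ["US"],
    "US": [],
}

def chains_from(node, tree):
    """All downward paths starting at `node` (chain-style candidates)."""
    paths = [(node,)]
    for child in tree[node]:
        for tail in chains_from(child, tree):
            paths.append((node,) + tail)
    return paths

def subtrees_from(node, tree):
    """All connected subtrees whose topmost node is `node`, as frozensets."""
    # For each dependent: either omit it (None) or attach one of its subtrees.
    options = [[None] + subtrees_from(child, tree) for child in tree[node]]
    result = []
    for choice in product(*options):
        nodes = {node}
        for picked in choice:
            if picked is not None:
                nodes |= picked
        result.append(frozenset(nodes))
    return result

def all_chains(tree):
    return [path for node in tree for path in chains_from(node, tree)]

def all_subtrees(tree):
    return [sub for node in tree for sub in subtrees_from(node, tree)]

if __name__ == "__main__":
    chains = all_chains(TREE)
    subtrees = all_subtrees(TREE)
    chain_sets = {frozenset(path) for path in chains}
    print(len(chains), "chain candidates")
    print(len(subtrees), "subtree candidates")
    # Subtree candidates that no single chain can express:
    for extra in sorted(set(subtrees) - chain_sets, key=len):
        print(sorted(extra))
```

In this toy tree every chain is also a subtree, but a pattern linking the verb to both of its dependents at once exists only in the subtree space. That extra coverage is the kind of expressiveness the abstract credits for the recall improvement; the candidate space also grows quickly, so a practical system would need some ranking or filtering step, which this sketch omits.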

Similar resources

Comparing Information Extraction Pattern Models

Several recently reported techniques for the automatic acquisition of Information Extraction (IE) systems have used dependency trees as the basis of their extraction pattern representation. These approaches have used a variety of pattern models (schemes for representing IE patterns based on particular parts of the dependency analysis). An appropriate model should be expressive enough to represe...

Japanese Information Extraction with Automatically Extracted Patterns

One of the central issues for information extraction (IE) systems is the cost of customization from one scenario to another. Research on the automated acquisition of patterns is important for portability and scalability. This paper explores the automatic extraction of patterns in Japanese from unannotated text. We introduce two modules of our system, the pattern extraction module and the inform...

A Task-based Comparison of Information Extraction Pattern Models

Several recent approaches to Information Extraction (IE) have used dependency trees as the basis for an extraction pattern representation. These approaches have used a variety of pattern models (schemes which define the parts of the dependency tree which can be used to form extraction patterns). Previous comparisons of these pattern models are limited by the fact that they have used indirect ta...

Automatic Discovery of Linguistic Patterns for Information Extraction

Information Extraction (IE) systems typically rely on extraction patterns encoding domain-specific knowledge. When matched against natural language texts, these patterns recognize with high accuracy information relevant to the extraction task. Adapting an IE system to a new extraction scenario entails devising a new collection of extraction patterns, a time-consuming and expensive process. To ov...

On the Expressiveness of Information Extraction Patterns

Many recently reported machine learning approaches to the acquisition of information extraction (IE) patterns have used dependency trees as the basis for their pattern representations (Yangarber et al., 2000a; Yangarber, 2003; Sudo et al., 2003; Stevenson and Greenwood, 2005). While varying results have been reported for the resulting IE systems, little has been reported about the ability of dep...

Publication date: 2003